perm filename CHAP4[4,KMC]27 blob
sn#074496 filedate 1973-11-26 generic text, type T, neo UTF8
00100 LANGUAGE-RECOGNITION PROCESSES FOR UNDERSTANDING DIALOGUES
00200 IN TELETYPED PSYCHIATRIC INTERVIEWS
00300
00400 Since the behavior being simulated by this paranoid model is
00500 the sequential language-behavior of a paranoid patient in a
00600 psychiatric interview, the model (PARRY) must have an ability to
00700 interpret and respond to natural language input to a degree
00800 sufficient to demonstrate conduct characteristic of the paranoid
00900 mode. By "natural language" I shall mean ordinary American
01000 English such as is used in everyday conversations. It is still
01100 difficult to be explicit about the processes which enable humans to
01200 interpret and respond to natural language. ("A mighty maze! but
01300 not without a plan." - A. Pope). Philosophers, linguists and
01400 psychologists have investigated natural language with various
01500 purposes. Few of the results have been useful to builders of
01600 interactive simulation models. Attempts have been made in artificial
01700 intelligence to write algorithms which "understand" teletyped
01800 natural language expressions. (Colby and Enea, 1967; Enea and
01900 Colby, 1973; Schank, Goldman, Rieger, and Riesbeck, 1973;
02000 Winograd, 1973; Woods, 1970). Computer understanding of natural
02100 language is actively being attempted today, but it is not something
02200 to be completely achieved today or even tomorrow. For our model the
02300 problem was not to find immediately the best way of doing it but to
02400 find any way at all. We sought pragmatic feasibility, not instant
02500 optimality.
02600 During the 1960's when machine processing of natural language
02700 was dominated by syntactic considerations, it became clear that
02800 syntactical information alone was insufficient to comprehend the
02900 expressions of ordinary conversations. A current view is that to
03000 understand what information is contained in linguistic expressions,
03100 knowledge of syntax and semantics must be combined with beliefs from
03200 a conceptual structure capable of making inferences. How to achieve
03300 this combination efficiently with a large data-base represents a
03400 monumental task for both theory and implementation.
03500 Seeking practical performance, we did not attempt to
03600 construct a conventional linguistic parser to analyze conversational
03700 language of interviews. Parsers to date have had great difficulty in
03800 performing well enough to assign a meaningful interpretation to the
03900 expressions of everyday conversational language in unrestricted
04000 English. Purely syntactic parsers offer a cancerous proliferation
04100 of interpretations. A conventional parser, lacking mechanisms for
04200 neglecting and ignoring, may simply halt when it comes across a word
04300 not in its dictionary. Parsers represent tight conjunctions of tests
04400 instead of the loose disjunctions needed for gleaning some degree of
04500 meaning from everyday language communication. It is easily observed
04600 that people misunderstand and ununderstand at times and thus remain
04700 partially opaque to one another, a truth which lies at the core of
04800 human life and communication.
04900 How language is understood depends on how people interpret
05000 the meanings of situations they find themselves in. In a dialogue,
05100 language is understood in accordance with a participant's view of the
05200 situation. The participants are interested in both what an utterance
05300 means (what it refers to) and what the utterer means (his
05400 intentions). In a first psychiatric interview the doctor's intention
05500 is to gather certain kinds of information; the patient's intention is
05600 to give information in order to receive help. Such an interview is
05700 not small talk; a job is to be done. Our purpose was to develop a
05800 method for recognizing sequences of everyday English sufficient for
05900 the model to communicate linguistically in a paranoid way in the
06000 circumscribed situation of a psychiatric interview.
06100 We did not try to construct a general-purpose algorithm which
06200 could understand anything said in English by anybody to anybody else
06300 in any dialogue situation. (Does anyone believe it to be currently
06400 possible? The seductive myth of generalization can lead to
06500 trivialization). We sought simply to extract some degree of, or
06600 partial idiosyncratic, idiolectic meaning (not the "complete"
06700 meaning, whatever that means) from the input. We utilized a
06800 pattern-directed, rather than a parsing-directed, approach because of
06900 the former's power to ignore irrelevant and unintelligible details.
07000 Natural language is not an agreed-upon universe of discourse
07100 such as arithmetic, wherein symbols have a fixed meaning for everyone
07200 who uses them. What we loosely call "natural language" is actually a
07300 set of history-dependent, selective, and interest-oriented idiolects,
07400 each being unique to the individual with a unique history. (To be
07500 unique does not mean that no property is shared with other
07600 individuals, only that not every property is shared). It is the broad
07700 overlap of idiolects which allows the communication of shared
07800 meanings in everyday conversation.
07900 We took as pragmatic measures of "understanding" the ability
08000 (1) to form a conceptualization so that questions can be answered and
08100 commands carried out, (2) to determine the intention of the
08200 interviewer, (3) to determine the references for pronouns and other
08300 anticipated topics. This straightforward approach to a complex
08400 problem has its drawbacks, as will be shown. We strove for a highly
08500 individualized idiolect sufficient to demonstrate paranoid processes
08600 of an individual in a particular situation rather than for a general
08700 supra-individual or ideal comprehension of English. If the
08800 language-recognition processes of PARRY were to interfere with
08900 demonstrating the paranoid processes, we would consider them
09000 defective and insufficient for our purposes.
09100 The language-recognition process utilized by PARRY first puts
09200 the teletyped input in the form of a list and then determines the
09300 syntactic type of the input expression - question, statement or
09400 imperative - by looking at introductory terms and at punctuation. The
09500 expression-type is then scanned for conceptualizations, i.e.
09600 patterns of contentives (words or word-groups) - stress-forms of
09700 speech having conceptual meaning relevant to the model's interests.
09800 The search for conceptualizations ignores (as irrelevant details)
09900 function or closed-class terms (articles, auxiliaries, conjunctions,
10000 prepositions, etc.) except as they might represent a component in a
10100 contentive word-group. For example, the word-group (for a living) is
10200 defined to mean `work' as in "What do you do for a living?" The
10300 conceptualization is classified according to the rules of Fig. 1 as
10400 malevolent, benevolent or neutral. Thus PARRY attempts to judge the
10500 intention of the utterer from the content of the utterance.
10600 (INSERT FIG.1 HERE)
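The intention-judging step can be sketched in present-day terms as follows. This is a minimal illustration in Python, not the original program; the word lists stand in for the actual classification rules of Fig. 1 and are invented for the example:

```python
# Hypothetical contentive classes; the real rules of Fig. 1 differ.
MALEVOLENT = {"hate", "kill", "crazy", "hurt", "lock"}
BENEVOLENT = {"help", "like", "trust", "friend"}
FUNCTION_WORDS = {"the", "a", "an", "do", "you", "i", "to", "of",
                  "is", "are", "and", "or"}

def contentives(expression):
    """Drop closed-class (function) terms, keeping only contentives."""
    words = expression.lower().strip("?!. ").split()
    return [w for w in words if w not in FUNCTION_WORDS]

def judge_intention(expression):
    """Classify an utterance as malevolent, benevolent or neutral."""
    found = set(contentives(expression))
    if found & MALEVOLENT:
        return "malevolent"
    if found & BENEVOLENT:
        return "benevolent"
    return "neutral"
```

The scan ignores function words entirely, as described above; a real rule set would also recognize contentive word-groups such as "for a living".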
10700 Some special problems a dialogue algorithm must handle in a
10800 psychiatric interview will now be outlined along with a brief
10900 description of how the model deals with them.
11000
11100 QUESTIONS
11200
11300 The principal expression-type used by an interviewer is a
11400 question. A question is recognized by its first term being a "wh-" or
11500 "how" form and/or an expression ending with a question-mark. In
11600 teletyped interviews a question may sometimes be put in declarative
11700 form followed by a question mark as in:
11800 (1) PT.- I LIKE TO GAMBLE ON THE HORSES.
11900 (2) DR.- YOU GAMBLE?
12000 Although a question-word or auxiliary verb is missing in (2), the
12100 model recognizes that a question is being asked about its gambling
12200 simply by the question mark.
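The surface test just described can be sketched as follows (a modern illustration; the function name and the imperative check are invented, and the original program's term lists were larger):

```python
WH_FORMS = ("what", "who", "whom", "whose", "when", "where", "why",
            "which", "how")

def expression_type(expression):
    """Decide question / imperative / statement from surface cues alone."""
    text = expression.strip()
    first = text.split()[0].lower().rstrip("?,.")
    # A question-mark alone suffices, as in "YOU GAMBLE?".
    if text.endswith("?") or first in WH_FORMS:
        return "question"
    if first in ("tell", "lets"):
        return "imperative"
    return "statement"
```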
12300 Particularly difficult are those `when' questions which
12400 require a memory which can assign each event a beginning, an end and
12500 a duration. An improved version of the model should have this
12600 capacity. Also troublesome are questions such as `how often', `how
12700 many', i.e. a `how' followed by a quantifier. If the model has "how
12800 often" on its expectancy list while a topic is under discussion, the
12900 appropriate reply can be made. Otherwise the model fails to
13000 understand.
13100 In constructing a simulation of symbolic processes it is
13200 arbitrary how much information to represent in the data-base. Should
13300 PARRY know which city is the capital of Alabama? It is trivial to
13400 store tomes of facts and there will always be boundary conditions.
13500 We took the position that the model should know only what we believed
13600 it reasonable to know relative to a few hundred topics expectable in
13700 a psychiatric interview. Thus PARRY performs poorly when subjected to
13800 baiting `exam' questions designed to test its informational
13900 limitations rather than to seek useful psychiatric information.
14000
14100 IMPERATIVES
14200
14300 Typical imperatives in a psychiatric interview consist of
14400 expressions like:
14500 (3) DR.- TELL ME ABOUT YOURSELF.
14600 (4) DR.- LETS DISCUSS YOUR FAMILY.
14700 Such imperatives are actually interrogatives to the
14800 interviewee about the topics they refer to. Since the only physical
14900 action the model can perform is to `talk', imperatives are treated
15000 as requests for information. They are identified by the common
15100 introductory phrases: "tell me", "lets talk about", etc.
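The treatment of imperatives as requests for information can be sketched as follows; the phrase table is illustrative beyond the two introductory phrases quoted above:

```python
# Hypothetical table of introductory phrases marking an imperative.
IMPERATIVE_PREFIXES = ("tell me about", "lets talk about",
                       "lets discuss", "talk about")

def imperative_topic(expression):
    """Return the topic information is requested about, or None."""
    text = expression.lower().strip(".!? ")
    for prefix in IMPERATIVE_PREFIXES:
        if text.startswith(prefix):
            # The remainder names the topic to talk about.
            return text[len(prefix):].strip()
    return None
```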
15150 
15200 DECLARATIVES
15300
15400 In this category is lumped everything else. It includes
15500 greetings, farewells, yes-no type answers, existence assertions and
15600 the usual predications.
15700
15800 AMBIGUITIES
15900
16000 Words have more than one sense, a convenience for human
16100 memories but a struggle for language-understanding algorithms.
16200 Consider the word "bug" in the following expressions:
16300 (5) AM I BUGGING YOU?
16400 (6) AFTER A PERIOD OF HEAVY DRINKING HAVE YOU FELT BUGS ON
16500 YOUR SKIN?
16600 (7) DO YOU THINK THEY PUT A BUG IN YOUR ROOM?
16700 In expression (5) the term "bug" means to annoy, in (6) it
16800 refers to an insect and in (7) it refers to a microphone used for
16900 hidden surveillance. PARRY uses context to carry out
17000 disambiguation. For example, when the Mafia is under discussion and
17100 the affect-variable of fear is high, the model interprets "bug" to
17200 mean microphone. In constructing this hypothetical individual we
17300 took advantage of the selective nature of idiolects which can have an
17400 arbitrary restriction on word senses. One characteristic of the
17500 paranoid mode is that regardless of what sense of a word the
17600 interviewer intends, the patient may idiosyncratically interpret it
17700 as some sense of his own. This property is obviously of great help
17800 for an interactive simulation with limited language-understanding
17900 abilities.
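The context-driven choice among the senses of "bug" can be sketched as follows. The topic names, the fear threshold and the numeric scale are illustrative assumptions; the model's actual affect variables are described elsewhere in this book:

```python
def sense_of_bug(topic, fear):
    """Pick a word sense from the topic under discussion and fear level."""
    if topic == "mafia" and fear > 0.5:
        return "microphone"   # hidden surveillance, as in (7)
    if topic == "drinking":
        return "insect"       # bugs on the skin, as in (6)
    return "annoy"            # the everyday sense, as in (5)
```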
17950 
18000 ANAPHORIC REFERENCES
18050 
18100 The common anaphoric references consist of the pronouns "it",
18200 "he", "him", "she", "her", "they", "them" as in:
18300 (8) PT.-HORSERACING IS MY HOBBY.
18400 (9) DR.-WHAT DO YOU ENJOY ABOUT IT?
18500 When a topic is introduced by the patient as in (8), a
18600 number of things can be expected to be asked about it. Thus the
18700 algorithm has ready an updated expectancy-anaphora list which allows
18800 it to determine whether the topic introduced by the model is being
18900 responded to or whether the interviewer is continuing with the
19000 previous topic.
19100 The algorithm recognizes "it" in (9) as referring to
19200 "horseracing" because a flag for horseracing was set when horseracing
19300 was introduced in (8), "it" was placed on the expected anaphora list,
19400 and no new topic has been introduced. A more difficult problem arises
19500 when the anaphoric reference points more than one I-O pair back in
19600 the dialogue as in:
19700 (10) PT.-THE MAFIA IS OUT TO GET ME.
19800 (11) DR.- ARE YOU AFRAID OF THEM?
19900 (12) PT.- MAYBE.
20000 (13) DR.- WHY IS THAT?
20100 The "that" of expression (13) does not refer to (12) but to
20200 the topic of being afraid which the interviewer introduced in (11).
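For the simple one-pair-back case of (8)-(9), the flag-and-expectancy mechanism can be sketched as follows. Class and method names are invented, and the harder cases of (10)-(16) are not covered:

```python
class AnaphoraResolver:
    """Topic flag plus expectancy-anaphora list, as in (8)-(9)."""

    def __init__(self):
        self.current_topic = None
        self.expected = set()

    def introduce_topic(self, topic, pronouns):
        # Setting the topic flag also loads the expected pronouns.
        self.current_topic = topic
        self.expected = set(pronouns)

    def resolve(self, pronoun):
        # A pronoun on the list refers to the flagged topic; otherwise
        # the reference is not resolved at this level.
        if pronoun in self.expected:
            return self.current_topic
        return None
```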
20300 Another pronominal confusion occurs when the interviewer uses
20400 `we' in two senses as in:
20500 (14) DR.- WE WANT YOU TO STAY IN THE HOSPITAL.
20600 (15) PT.- I WANT TO BE DISCHARGED NOW.
20700 (16) DR.- WE ARE NOT COMMUNICATING.
20800 In expression (14) the interviewer is using "we" to refer to
20900 psychiatrists or the hospital staff while in (16) the term refers to
21000 the interviewer and patient. Identifying the correct referent would
21100 require beliefs about the dialogue itself.
21200
21300 TOPIC SHIFTS
21400
21500 In the main, a psychiatric interviewer is in control of the
21600 interview. When he has gained sufficient information about a topic,
21700 he shifts to a new topic. Naturally the algorithm must detect this
21800 change of topic as in the following:
21900 (17) DR.- HOW DO YOU LIKE THE HOSPITAL?
22000 (18) PT.- ITS NOT HELPING ME TO BE HERE.
22100 (19) DR.- WHAT BROUGHT YOU TO THE HOSPITAL?
22200 (20) PT.- I AM VERY UPSET AND NERVOUS.
22300 (21) DR.- WHAT TENDS TO MAKE YOU NERVOUS?
22400 (22) PT.- JUST BEING AROUND PEOPLE.
22500 (23) DR.- ANYONE IN PARTICULAR?
22600 In (17) and (19) the topic is the hospital. In (21) the topic
22700 changes to causes of the patient's nervous state.
22800 Topics touched upon previously can be re-introduced at any
22900 point in the interview. PARRY knows that a topic has been discussed
23000 previously because a topic-flag is set when a topic comes up.
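The topic-flag mechanism can be sketched as follows; the function name and the module-level set are invented stand-ins for the model's flag store:

```python
discussed = set()   # topic flags, one per topic that has come up

def bring_up(topic):
    """Return True if the topic was discussed before, then flag it."""
    seen = topic in discussed
    discussed.add(topic)
    return seen
```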
23100
23200 META-REFERENCES
23300
23400 These are references, not about a topic directly, but about
23500 what has been said about the topic as in:
23600 (25) DR.- WHY ARE YOU IN THE HOSPITAL?
23700 (26) PT.- I SHOULDNT BE HERE.
23800 (27) DR.- WHY DO YOU SAY THAT?
23900 The expression (27) is about and meta to expression (26). The model
24000 does not respond with a reason why it said something but with a
24100 reason for the content of what it said, i.e. it interprets (27) as
24200 "why shouldn't you be here?"
24300 Sometimes when the patient makes a statement, the doctor
24400 replies, not with a question, but with another statement which
24500 constitutes a rejoinder as in:
24600 (28) PT.- I HAVE LOST A LOT OF MONEY GAMBLING.
24700 (29) DR.- I GAMBLE QUITE A BIT ALSO.
24800 Here the algorithm interprets (29) as a directive to
24900 continue discussing gambling, not as an indication to question the
25000 doctor about gambling.
25100
25200 ELLIPSES
25300
25400
25500 In dialogues one finds many ellipses, expressions from which
25600 one or more words are omitted as in:
25700 (30) PT.- I SHOULDNT BE HERE.
25800 (31) DR.- WHY NOT?
25900 Here the complete construction must be understood as:
26000 (32) DR.- WHY SHOULD YOU NOT BE HERE?
26100 Again, this is handled by the expectancy-anaphora list which
26200 anticipates a "why not".
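The anticipation of "why not" can be sketched as follows. Only the single pattern of (30)-(32) is covered, and the table-building function is an invented stand-in for the model's expectancy list:

```python
def expect_after(statement):
    """Build a small expectancy table of elliptical follow-ups."""
    s = statement.upper()
    if s.startswith("I SHOULDNT"):
        rest = s[len("I SHOULDNT"):].strip(". ")
        # Anticipate "WHY NOT?" and store its full construction.
        return {"WHY NOT?": "WHY SHOULD YOU NOT " + rest + "?"}
    return {}

def expand_ellipsis(statement, reply):
    """Expand an anticipated ellipsis; leave other replies alone."""
    return expect_after(statement).get(reply.upper(), reply)
```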
26300 The opposite of ellipsis is redundancy which usually provides
26400 no problem since the same thing is being said more than once as in:
26500 (33) DR.- LET ME ASK YOU A QUESTION.
26600 The model simply recognizes (33) as a stereotyped pattern.
26700
26800 SIGNALS
26900
27000 Some fragmentary expressions serve only as directive signals
27100 to proceed, as in:
27200 (34) PT.- I WENT TO THE TRACK LAST WEEK.
27300 (35) DR.- AND?
27400 The fragment of (35) requests a continuation of the story introduced
27500 in (34). The common expressions found in interviews are "and", "so",
27600 "go on", "go ahead", "really", etc. If an input expression cannot be
27700 recognized at all, the lowest level default condition is to assume it
27800 is a signal and either proceed with the next line in a story under
27900 discussion or if a story has been exhausted, begin a new story with a
28000 prompting question or statement.
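The signal test and the lowest-level default can be sketched together as follows; the story lines and the prompting question are invented stand-ins for the model's actual self-narratives:

```python
FRAGMENT_SIGNALS = {"and", "so", "go on", "go ahead", "really"}

def is_signal(text):
    """Recognize a fragmentary directive such as "AND?" or "GO ON"."""
    return text.strip().lower().rstrip("?.!") in FRAGMENT_SIGNALS

def reply_to(text, story, position):
    """Signals and wholly unrecognized inputs alike continue the story;
    an exhausted story triggers a new prompting question."""
    if position < len(story):
        return story[position], position + 1
    return "HAVE YOU EVER BEEN TO THE RACES?", position
```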
28100
28200 IDIOMS
28300
28400 Since so much of conversational language involves stereotypes
28500 and special cases, the task of recognition is much easier than that
28600 of linguistic analysis. This is particularly true of idioms. Either
28700 one knows what an idiom means or one does not. It is usually hopeless
28800 to try to decipher what an idiom means from an analysis of its
28900 constituent parts. If the reader doubts this, let him ponder the
29000 following expressions taken from actual teletyped interviews.
29100 (36) DR.- WHATS EATING YOU?
29200 (37) DR.- YOU SOUND KIND OF PISSED OFF.
29300 (38) DR.- WHAT ARE YOU DRIVING AT?
29400 (39) DR.- ARE YOU PUTTING ME ON?
29500 (40) DR.- WHY ARE THEY AFTER YOU?
29600 (41) DR.- HOW DO YOU GET ALONG WITH THE OTHER PATIENTS?
29700 (42) DR.- HOW DO YOU LIKE YOUR WORK?
29800 (43) DR.- HAVE THEY TRIED TO GET EVEN WITH YOU?
29900 (44) DR.- I CANT KEEP UP WITH YOU.
30000 In people, the use of idioms is a matter of rote memory or
30100 analogy. In an algorithm, idioms can simply be stored as such. As
30200 each new idiom appears in teletyped interviews, its
30300 recognition-pattern is added to the data-base on the inductive
30400 grounds that what happens once can happen again.
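Rote idiom storage can be sketched as follows; the glosses are illustrative paraphrases, not the model's internal representation:

```python
# Each idiom maps to its meaning as an unanalyzed unit.
idioms = {
    "whats eating you": "what is bothering you",
    "are you putting me on": "are you teasing me",
    "out to get": "intend to harm",
}

def recognize_idiom(expression):
    """Look the expression up against stored idiom patterns."""
    text = expression.lower().strip("?.! ")
    for pattern, meaning in idioms.items():
        if pattern in text:
            return meaning
    return None

def learn_idiom(pattern, meaning):
    """Rote addition of a newly met idiom; no analysis of its parts."""
    idioms[pattern] = meaning
```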
30500 Another advantage in constructing an idiolect for a model is
30600 that it recognizes its own idiomatic expressions which tend to be
30700 used by the interviewer (if he understands them) as in:
30800 (45) PT.- THEY ARE OUT TO GET ME.
30900 (46) DR.- WHAT MAKES YOU THINK THEY ARE OUT TO GET YOU.
31000 The expression (45) is really a double idiom in which "out"
31100 means `intend' and "get" means `harm' in this context. Needless to
31200 say, an algorithm which tried to pair off the various meanings of
31300 "out" with the various meanings of "get" would have a hard time of
31400 it. But an algorithm which recognizes what it itself is capable of
31500 saying, can easily recognize echoed idioms.
31600
31700 FUZZ TERMS
31800
31900 In this category fall a large number of expressions which, as
32000 non-contentives, have little or no meaning and therefore can be
32100 ignored by the algorithm. The lower-case expressions in the following
32200 are examples of fuzz:
32300 (47) DR.- well now perhaps YOU CAN TELL ME something ABOUT
32400 YOUR FAMILY.
32500 (48) DR.- on the other hand I AM INTERESTED IN YOU.
32600 (49) DR.- hey I ASKED YOU A QUESTION.
32700 The algorithm has "ignoring mechanisms" which allow for an
32800 `anything' slot in its pattern recognition. Fuzz terms are thus
32900 easily ignored and no attempt is made to analyze them.
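The `anything' slot can be sketched as an in-order match of a pattern's contentives, with every intervening word treated as ignorable fuzz; the representation is invented for the example:

```python
def matches(pattern_words, input_words):
    """True if pattern_words occur in order; all other words are fuzz.
    Consuming a single iterator enforces left-to-right order."""
    it = iter(w.lower().strip(".?,!") for w in input_words)
    return all(word in it for word in pattern_words)
```

Applied to (47), the pattern ("you", "tell", "me", "about", "family") matches while "well now perhaps" and "something" fall into the ignored slots.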
33000
33100 SUBORDINATE CLAUSES
33200
33300 A subordinate clause is a complete statement inside another
33400 statement. It is most frequently introduced by a relative pronoun,
33500 indicated in the following expressions by lower case:
33600 (50) DR.- WAS IT THE UNDERWORLD that PUT YOU HERE?
33700 (51) DR.- WHO ARE THE PEOPLE who UPSET YOU?
33800 (52) DR.- HAS ANYTHING HAPPENED which YOU DONT UNDERSTAND?
33900 One of the linguistic weaknesses of the model is that it
34000 takes the entire input as a single expression. When the input is
34100 syntactically complex, containing subordinate clauses, the algorithm
34200 can become confused. To avoid this, future versions of PARRY will
34300 segment the input into shorter and more manageable patterns in which
34400 an optimal selection of emphases and neglect of irrelevant detail can
34500 be achieved while avoiding combinatorial explosions.
34550 
34600 VOCABULARY
34700
34800 How many words should there be in the algorithm's vocabulary?
34900 It is a rare human speaker of English who can recognize 40% of the
35000 415,000 words in the Oxford English Dictionary. In his everyday
35100 conversation an educated person uses perhaps 10,000 words and has a
35200 recognition vocabulary of about 50,000 words. A study of telephone
35300 conversations showed that 96% of the talk employed only 737 words
35400 (French, Carter, and Koenig, 1930). Of course if the remaining 4% are
35500 important but unrecognized contentives, the result may be ruinous to
35600 the coherence of a conversation.
35700 In counting all the words in 53 teletyped psychiatric
35800 interviews conducted by psychiatrists, we found only 721 different
35900 words. Since we are familiar with psychiatric vocabularies and
36000 styles of expression, we believed this language-algorithm could
36100 function adequately with a vocabulary of at most a few thousand
36200 contentives. There will always be unrecognized words. The algorithm
36300 must be able to continue even if it does not have a particular word
36400 in its vocabulary. This provision represents one great advantage
36500 of pattern-matching over conventional linguistic parsing. Our
36600 algorithm can guess while a traditional parser must know with
36700 certainty in order to proceed.
36800
36900 MISSPELLINGS AND EXTRA CHARACTERS
36950 
37000 There is really no good defense against misspellings in a
37100 teletyped interview except having a human monitor the conversation
37200 and make the necessary corrections.
37300 Extra characters sent over the teletype by the interviewer or
37400 by a bad phone line can be removed by a human monitor since the
37500 output from the interviewer first appears on the monitor's console
37600 and then is typed by her directly to the program.
37700
37800 META VERBS
37900
38000 Certain common verbs such as "think", "feel", "believe", etc.
38100 can take a clause as their objects as in:
38200 (54) DR.- I THINK YOU ARE RIGHT.
38300 (55) DR.- WHY DO YOU FEEL THE GAMBLING IS CROOKED?
38400 The verb "believe" is peculiar since it can also take as
38500 object a noun or noun phrase as in:
38600 (56) DR.- I BELIEVE YOU.
38700 In expression (55) the conjunction "that" can follow the word
38800 "feel" signifying a subordinate clause. This is not the case after
38900 "believe" in expression (56). PARRY makes the correct
39000 identification in (56) because nothing follows the "you".
39050 
39100 ODD WORDS
39150 
39200 From extensive experience with teletyped interviews, we
39300 learned the model must have patterns for "odd" words. We term them
39400 such since these are words which are quite natural in the usual
39500 vis-a-vis interview in which the participants communicate through
39600 speech, but which are quite odd in the context of a teletyped
39700 interview. This should be clear from the following examples in which
39800 the odd words appear in lower case:
39900 (57) DR.-YOU sound CONFUSED.
40000 (58) DR.- DID YOU hear MY LAST QUESTION?
40100 (59) DR.- WOULD YOU come in AND sit down PLEASE?
40200 (60) DR.- CAN YOU say WHO?
40300 (61) DR.- I WILL see YOU AGAIN TOMORROW.
40400
40500
40600 MISUNDERSTANDING
40700
40800 It is perhaps not fully recognized by students of language
40900 how often people misunderstand one another in conversation and yet
41000 their dialogues proceed as if understanding and being understood had
41100 taken place.
41200 A classic example is the following man-on-the-street interview.
41300 INTERVIEWER - WHAT DO YOU THINK OF MARIHUANA?
41400 MAN - DIRTIEST TOWN IN MEXICO.
41500 INTERVIEWER - HOW ABOUT LSD?
41600 MAN - I VOTED FOR HIM.
41700 INTERVIEWER - HOW DO YOU FEEL ABOUT THE INDIANAPOLIS 500?
41800 MAN - I THINK THEY SHOULD SHOOT EVERY LAST ONE OF THEM.
41900 INTERVIEWER - AND THE VIET CONG POSITION?
42000 MAN - I'M FOR IT, BUT MY WIFE COMPLAINS ABOUT HER ELBOWS.
42100 Sometimes a psychiatric interviewer realizes when
42200 misunderstanding occurs and tries to correct it. Other times he
42300 simply passes it by. It is characteristic of the paranoid mode to
42400 respond idiosyncratically to particular word-concepts regardless of
42500 what the interviewer is saying:
42600 (62) PT.- SOME PEOPLE HERE MAKE ME NERVOUS.
42700 (63) DR.- I BET.
42800 (64) PT.- GAMBLING HAS BEEN NOTHING BUT TROUBLE FOR ME.
42900 Here one word sense of "bet" (to wager) is confused with the offered
43000 sense of expressing agreement. As has been mentioned, this
43100 sense-confusion property of paranoid conversation eases the task of
43200 simulation.
43250 
43300 UNUNDERSTANDING
43400
43500 A dialogue algorithm must be prepared for situations in which
43600 it simply does not understand. It cannot arrive at any interpretation
43700 as to what the interviewer is saying since no pattern can be matched.
43800 It may recognize the topic but not what is being said about it.
43900 The language-recognizer should not be faulted for a simple
44000 lack of irrelevant information as in:
44100 (65) DR.- WHAT IS THE FIFTIETH STATE?
44200 when the data-base does not contain the answer. In this default
44300 condition it is simplest to reply:
44400 (66) PT.- I DONT KNOW.
44500 When information is absent it is dangerous to reply:
44600 (67) PT.- COULD YOU REPHRASE THE QUESTION?
44700 because of the disastrous loops which can result.
44800 Since the main problem in the default condition of
44900 ununderstanding is how to continue, PARRY employs heuristics such
45000 as changing the level of the dialogue and asking about the
45100 interviewer's intention as in:
45200 (68) PT.- WHY DO YOU WANT TO KNOW THAT?
45300 or rigidly continuing with a previous topic or introducing a new
45400 topic.
45500 These are admittedly desperate measures intended to prompt
45600 the interviewer in directions the algorithm has a better chance of
45700 understanding. Although it is usually the interviewer who controls
45800 the flow from topic to topic, there are times when control must be
45900 assumed by the model.
46000 There are many additional problems in understanding
46100 conversational language but the description of this chapter should be
46200 sufficient to convey some of the complexities involved. Further
46300 examples will be presented in the next chapter in describing the
46400 logic of the central processes of the model.